 earth observation data


Out-of-Distribution Generalization in Climate-Aware Yield Prediction with Earth Observation Data

Chakravarty, Aditya

arXiv.org Artificial Intelligence

Climate change is increasingly disrupting agricultural systems, making accurate crop yield forecasting essential for food security. While deep learning models have shown promise in yield prediction using satellite and weather data, their ability to generalize across geographic regions and years - critical for real-world deployment - remains largely untested. We benchmark two state-of-the-art models, GNN-RNN and MMST-ViT, under realistic out-of-distribution (OOD) conditions using the large-scale CropNet dataset spanning 1,200+ U.S. counties from 2017-2022. Through leave-one-cluster-out cross-validation across seven USDA Farm Resource Regions and year-ahead prediction scenarios, we identify substantial variability in cross-region transferability. GNN-RNN demonstrates superior generalization with positive correlations under geographic shifts, while MMST-ViT performs well in-domain but degrades sharply under OOD conditions. Regions like Heartland and Northern Great Plains show stable transfer dynamics (RMSE less than 10 bu/acre for soybean), whereas Prairie Gateway exhibits persistent underperformance (RMSE greater than 20 bu/acre) across both models and crops, revealing structural dissimilarities likely driven by semi-arid climate, irrigation patterns, and incomplete spectral coverage. Beyond accuracy differences, GNN-RNN achieves 135x faster training than MMST-ViT (14 minutes vs. 31.5 hours), making it more viable for sustainable deployment. Our findings underscore that spatial-temporal alignment - not merely model complexity or data scale - is key to robust generalization, and highlight the need for transparent OOD evaluation protocols to ensure equitable and reliable climate-aware agricultural forecasting.
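
The leave-one-cluster-out protocol described above can be illustrated with a minimal sketch. It is not the authors' code: the region labels and yields are made-up toy values, the function names are hypothetical, and a mean-yield baseline stands in for GNN-RNN or MMST-ViT. It only shows how each region is held out in turn and scored by RMSE.

```python
from collections import defaultdict
from math import sqrt


def leave_one_cluster_out(clusters):
    """Yield (held_out, train_idx, test_idx) once per distinct cluster label."""
    by_cluster = defaultdict(list)
    for i, label in enumerate(clusters):
        by_cluster[label].append(i)
    for held_out in sorted(by_cluster):
        test_idx = by_cluster[held_out]
        train_idx = [i for i, label in enumerate(clusters) if label != held_out]
        yield held_out, train_idx, test_idx


def rmse(y_true, y_pred):
    """Root-mean-square error between two equal-length sequences."""
    return sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def evaluate(clusters, targets):
    """Score a placeholder model (mean of training targets) on each held-out cluster."""
    scores = {}
    for held_out, train_idx, test_idx in leave_one_cluster_out(clusters):
        mean_train = sum(targets[i] for i in train_idx) / len(train_idx)
        y_true = [targets[i] for i in test_idx]
        scores[held_out] = rmse(y_true, [mean_train] * len(test_idx))
    return scores


# Toy data (made up): one region label and one yield in bu/acre per county.
regions = ["Heartland", "Heartland", "Prairie Gateway",
           "Prairie Gateway", "Northern Great Plains", "Northern Great Plains"]
yields = [52.0, 50.0, 30.0, 34.0, 45.0, 47.0]

scores = evaluate(regions, yields)
for region, score in scores.items():
    print(f"held out {region}: RMSE = {score:.2f} bu/acre")
```

Because the held-out region never contributes to training, each score reflects transfer to an unseen region rather than in-domain fit. With scikit-learn available, `sklearn.model_selection.LeaveOneGroupOut` produces the same kind of split.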


A Latent Space Metric for Enhancing Prediction Confidence in Earth Observation Data

Pitsiorlas, Ioannis, Tsantalidou, Argyro, Arvanitakis, George, Kountouris, Marios, Kontoes, Charalambos

arXiv.org Artificial Intelligence

This study presents a new approach for estimating confidence in machine learning model predictions, specifically in regression tasks utilizing Earth Observation (EO) data, with a particular focus on mosquito abundance (MA) estimation. We use a Variational AutoEncoder architecture to derive a confidence metric from the latent space representations of EO datasets. The methodology rests on establishing a correlation between the Euclidean distance in latent representations and the Absolute Error (AE) in individual MA predictions. Our research focuses on EO datasets from the Veneto region in Italy and the Upper Rhine Valley in Germany, targeting areas significantly affected by mosquito populations. A key finding is a notable correlation of 0.46 between the AE of MA predictions and the proposed confidence metric. This correlation signifies a robust, new metric for quantifying the reliability and enhancing the trustworthiness of the AI model's predictions in the context of both EO data analysis and mosquito abundance studies.
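
The relationship between latent-space distance and prediction error can be sketched as follows. The abstract does not say what the distance is measured against, so this sketch assumes the centroid of the training latents as the reference point. The latent vectors and errors are made-up toy values, and no VAE is trained here: the vectors stand in for encoder outputs.

```python
from math import sqrt


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def centroid(vectors):
    """Component-wise mean of a list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Toy stand-ins for encoder outputs (assumption: 2-D latent space).
train_latents = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
test_latents = [[0.5, 0.5], [2.0, 2.0], [4.0, 0.5]]
abs_errors = [0.1, 0.9, 2.0]  # made-up absolute errors of the regressor

reference = centroid(train_latents)
distances = [euclidean(z, reference) for z in test_latents]
r = pearson(distances, abs_errors)
print(f"correlation between latent distance and absolute error: {r:.2f}")
```

A positive correlation means that samples far from the training data in latent space tend to have larger errors, which is what lets the distance act as a confidence signal.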


Pre-processing training data improves accuracy and generalisability of convolutional neural network based landscape semantic segmentation

Clark, Andrew, Phinn, Stuart, Scarth, Peter

arXiv.org Machine Learning

This was conducted through trialling and ranking various training patch selection sampling strategies, patch and batch sizes and data augmentations and scaling. We also compared model accuracy through producing the LULC classification using a single pass of a grid of patches and averaging multiple grid passes and three rotated versions of each patch. Our results showed: a stratified random sampling approach for producing training patches improved the accuracy of classes with a smaller area while having minimal effect on larger classes; a smaller number of larger patches compared to a larger number of smaller patches improves model accuracy; applying data augmentations and scaling is imperative in creating a generalised model able to accurately classify LULC features in imagery from a different date and sensor; and producing the output classification by averaging multiple grids of patches and three rotated versions of each patch produced a more accurate and aesthetic result. Combining the findings from the trials, we fully trained five models on the 2018 training image and applied the model to the 2015 test image with the output LULC classifications achieving an average kappa of 0.84, user accuracy of 0.81 and producer accuracy of 0.87. This study has demonstrated the importance of data pre-processing for developing a generalised deep-learning model for LULC classification which can be applied to a different date and sensor. Future research using CNN and earth observation data should implement the findings of this study to increase LULC model accuracy and transferability.
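
For readers unfamiliar with the reported metrics, the sketch below computes Cohen's kappa together with per-class user's and producer's accuracy from a confusion matrix. The matrix is a made-up two-class example, not data from the study. The row and column convention is an assumption stated in the code, and how the authors averaged across classes or models is not given in the abstract.

```python
def accuracy_metrics(cm):
    """Return (kappa, user_acc, producer_acc) for a square confusion matrix.

    Assumed convention: cm[i][j] counts samples whose reference class is i
    and whose predicted class is j.
    """
    k = len(cm)
    total = sum(sum(row) for row in cm)
    ref_totals = [sum(row) for row in cm]                       # row sums
    pred_totals = [sum(cm[i][j] for i in range(k)) for j in range(k)]  # column sums
    correct = [cm[i][i] for i in range(k)]

    observed = sum(correct) / total
    expected = sum(ref_totals[i] * pred_totals[i] for i in range(k)) / total ** 2
    kappa = (observed - expected) / (1 - expected)

    # Producer's accuracy: share of each reference class that was mapped correctly.
    producer_acc = [correct[i] / ref_totals[i] for i in range(k)]
    # User's accuracy: share of each predicted class that is actually correct.
    user_acc = [correct[i] / pred_totals[i] for i in range(k)]
    return kappa, user_acc, producer_acc


# Made-up two-class confusion matrix (rows: reference, columns: predicted).
confusion = [[45, 5],
             [10, 40]]
kappa, user_acc, producer_acc = accuracy_metrics(confusion)
print(f"kappa = {kappa:.2f}")
print("user's accuracy:", [round(u, 3) for u in user_acc])
print("producer's accuracy:", [round(p, 3) for p in producer_acc])
```

Kappa discounts the agreement expected by chance, which is why it can sit below overall accuracy even when most pixels are classified correctly.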


This data scientist wants to address human issues around AI

#artificialintelligence

Data scientist Oisín Boydell is working on a project that seeks to democratise access to an ever-increasing volume of Earth observation data. Dr Oisín Boydell is principal data scientist and head of the applied research group at CeADAR, the SFI-funded centre for applied AI at University College Dublin (UCD). His primary research interests include trustworthy AI, deep learning, natural language processing and applications of AI to Earth observation data. After working as a software developer in the UK, Boydell returned to UCD to undertake a PhD in computer science, researching novel approaches for personalised information retrieval. Prior to joining CeADAR he worked with SMEs and multinationals on big data analytics and machine learning solutions for the telecommunications industry.


Artificial Intelligence for Earth Monitoring MOOC

#artificialintelligence

Artificial intelligence (AI) is playing an increasingly important part in our daily lives, whether it is providing our personalised social media feeds, online shopping or streaming movie suggestions, or even the mapping apps that route us around traffic jams. On a bigger scale, AI is already having a major impact on healthcare, finance, farming and many other sectors and its influence is predicted to expand rapidly in the coming years. One area where there is considerable untapped potential for AI is in the field of Earth observation, where it can be used to help manage large datasets, find new insights in data and generate new products and services. With this in mind, EUMETSAT, ECMWF, Mercator Ocean International and the EEA have joined up to develop a new massive open online course (MOOC) on AI and Earth monitoring. The idea for the course is to introduce participants to the wealth of Copernicus Earth observation data and the AI and machine learning techniques that can be used to work with it.


Learning Structures in Earth Observation Data with Gaussian Processes

Mateo, Fernando, Munoz-Mari, Jordi, Laparra, Valero, Verrelst, Jochem, Camps-Valls, Gustau

arXiv.org Machine Learning

Gaussian Processes (GPs) have experienced tremendous success in geoscience in general and for bio-geophysical parameter retrieval in recent years. GPs constitute a solid Bayesian framework to formulate many function approximation problems consistently. This paper reviews the main theoretical GP developments in the field. We review new algorithms that respect the signal and noise characteristics, that provide feature rankings automatically, and that make the associated uncertainty intervals applicable for transporting GP models in space and time. All these developments are illustrated in the field of geoscience and remote sensing at local and global scales through a set of illustrative examples.
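
To make the idea of GP predictions with uncertainty intervals concrete, here is a minimal one-dimensional sketch of standard GP regression with a zero-mean prior and an RBF kernel. It is a textbook formulation, not one of the specialised algorithms the paper reviews. The training data, length scale and noise variance are made-up assumptions, and the small linear solver is included only to keep the example dependency-free.

```python
from math import exp, sqrt


def rbf(a, b, length_scale=1.0):
    """Squared-exponential kernel with unit signal variance."""
    return exp(-0.5 * ((a - b) / length_scale) ** 2)


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def gp_predict(x_train, y_train, x_new, noise_var=1e-6, length_scale=1.0):
    """Posterior mean and variance of the latent function at x_new."""
    n = len(x_train)
    K = [[rbf(x_train[i], x_train[j], length_scale) + (noise_var if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    k_star = [rbf(x, x_new, length_scale) for x in x_train]
    alpha = solve(K, y_train)   # K^{-1} y
    v = solve(K, k_star)        # K^{-1} k_*
    mean = sum(k * a for k, a in zip(k_star, alpha))
    var = rbf(x_new, x_new, length_scale) - sum(k * w for k, w in zip(k_star, v))
    return mean, max(var, 0.0)  # clip tiny negative values from round-off


# Made-up training data.
x_train = [0.0, 1.0, 2.0]
y_train = [0.0, 0.8, 0.9]

for x_new in (1.0, 1.5, 10.0):
    mean, var = gp_predict(x_train, y_train, x_new)
    half_width = 1.96 * sqrt(var)  # approximate 95% interval
    print(f"x = {x_new}: mean = {mean:.3f} +/- {half_width:.3f}")
```

The interval is narrow near the training inputs and widens to the prior far from them, which is the behaviour that makes the uncertainty useful when a model is applied to a new place or time.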


Object Detection and Image Segmentation with Deep Learning on Earth Observation Data: A Review-Part I: Evolution and Recent Trends

#artificialintelligence

Deep learning (DL) has had a great influence on large parts of science and has increasingly established itself as an adaptive method for new challenges in the field of Earth observation (EO). Nevertheless, the entry barriers for EO researchers are high due to the dense and rapidly developing field mainly driven by advances in computer vision (CV). To lower the barriers for researchers in EO, this review gives an overview of the evolution of DL with a focus on image segmentation and object detection in convolutional neural networks (CNN). The survey starts in 2012, when a CNN set new standards in image recognition, and lasts until late 2019. In doing so, we highlight the connections between the most important CNN architectures and cornerstones coming from CV in order to ease the evaluation of modern DL models.


Earth Observation data and Artificial Intelligence in support of Journalism

#artificialintelligence

Earth Observation data is valuable for journalists' reports to the public. Examples are the maps released at short notice during or after the Indian Ocean tsunami in 2004 or the Fukushima disaster in 2011, accompanying journalists' verbal or written reports. Taking advantage of the improved temporal frequency and spatial coverage of the Sentinel satellite sensors, SnapEarth aims to assimilate the latest spaceborne information to support journalists in their work in near real time. In this context, a dedicated services module aims to leverage Copernicus monitoring services, such as EFAS (European Flood Awareness System) and EFFIS (European Forest Fire Information System) of the EMS (Emergency Management Service). It will add to them the ability to exploit the latest AI (Artificial Intelligence) techniques to query large volumes of data automatically and in an unsupervised manner, delivering the required products in minimal time.


The Future Lies In Machine Learning, says Rogerio Bonifacio of UN WFP

#artificialintelligence

In an exclusive interview with GeoBuiz, Rogerio Bonifacio, Head, Geospatial Analysis Unit, UN World Food Programme, says that the future lies in machine learning. The World Food Programme is involved in humanitarian systems. The usage of earth observation data has grown manifold in recent years, with more focus on medium- and low-resolution data streams. Geospatial technologies provide the wide spatial coverage required to keep track of hazards. Each agency makes separate but coordinated use of earth observation data.


Powering geospatial analysis: public geo datasets now on Google Cloud

#artificialintelligence

With dozens of public satellites in orbit and many more scheduled over the next decade, the size and complexity of geospatial imagery continue to grow. It has become increasingly difficult to manage this flood of data and use it to gain valuable insights. That's why we're excited to announce that we're bringing two of the most important collections of public, cost-free satellite imagery to Google Cloud: Landsat and Sentinel-2. The Landsat mission, developed under a joint program of the USGS and NASA, is the longest continuous space-based record of Earth's land in existence, dating back to 1972 with the Landsat 1 satellite. Landsat imagery sets the standard for Earth observation data due to the length of the mission and the rich data provided by its multispectral sensors.